%infrastructure
%\subsection{System infrastructure}
%Our system comprises of a varying number of cores

Our simulations were carried out on a multicore system composed primarily of Intel Ivy Bridge-like cores.
Table~\ref{tab:syspara} lists the system configuration used for our default simulations.


\begin{table}[ht!]\scriptsize
\centering
\begin{center}
\begin{tabular}{|c|c|} \hline
%Workload Type & \\
Processor & Ivy Bridge microarchitecture \\ \hline
L1 Cache & 32 KB D/I, 8-way set-associative \\ \hline
L2 Cache & 256 KB private cache \\ \hline
L3 Cache & 2 MB/core shared LLC, 16-way set-associative \\ \hline
DRAM & 4 GB, DDR3-1600, 1 memory channel \\ \hline
% & per core; 4MB shared LLC 
% & L1 hit latency: 1 cycles; L2 hit latency: 8 cycles\\ \hline
%Memory & 4GB; DDR2-1600; 1 memory channel; \hline
\end{tabular}
\caption {Configuration of the evaluation platform.}
\label{tab:syspara}
\end{center}
\vspace{-0.2in}
\end{table}


\subsection{Architectural simulation setup and benchmarks}
We used the Sniper-5.0~\cite{sniper} system simulation tool for our performance simulations. 
This tool is integrated with McPAT-0.8~\cite{mcpat}, which we use for power and area estimation.
We instrumented the McPAT technology file with parameters obtained from our TCAD simulations described in Section~\ref{sec:background}. These parameters are listed in Table~\ref{tab:device-params}.
In addition, McPAT interfaces with CACTI to provide timing models for every processor component as well as for wires, which we use to determine the critical-path delay.


\begin{table}[ht!]\scriptsize
\centering
\begin{center}
\begin{tabular}{|c|c|c|c|c|c|} \hline
%Workload Type & \\
Parameter & FinFET & HTFET & Parameter & FinFET & HTFET	\\ \hline
$C_{ox}$(fF/$\mu$m) & 1.28 & 1.28 & $L_g$(nm) & 26 & 26\\ \hline
$V_{th}$(V) & 0.25--0.3 & 0.1 (eff) & $V_{d-sat}$(V) & 0.419 & 0.288 \\ \hline 
$R_{on}$(k$\Omega$-$\mu$m) & 1.01 & 2.43 & $I_{on}$(mA/$\mu$m) & 0.71 & 0.166 \\ \hline
$C_{g-ideal}$(fF/$\mu$m) & 0.55 & 0.327 & EOT(nm) &  0.7 & 0.7 \\ \hline
Source & 1e20 & 4e19 & Drain & 1e20 & 8e17 \\ 
Doping($/cm^3$) & n+ & GaSb p+ & Doping($/cm^3$) & n+ & InAs n+ \\ \hline
%Doping(D)($/cm^3$) & 1e20 (n+) & 8e17(n+) &&&\\ \hline
% & per core; 4MB shared LLC 
% & L1 hit latency: 1 cycles; L2 hit latency: 8 cycles\\ \hline
%Memory & 4GB; DDR2-1600; 1 memory channel; \hline
\end{tabular}
\caption {Technology parameters of the Si FinFET and HTFET devices.}
\label{tab:device-params}
\end{center}
\vspace{-0.2in}
\end{table}


To obtain thermal profiles, we generated periodic traces with Sniper and ran McPAT on each individual trace to build a power profile.
Our processor logic and wire models, described in Section~\ref{sec:technique}, were used to derive the corresponding TFET power numbers from the CMOS core simulations.
These power profiles were then used as input to Hotspot3D to obtain the temperature variation across the processor.



